Improved Dataset Characterisation for Meta-learning

Authors

  • Yonghong Peng
  • Peter A. Flach
  • Carlos Soares
  • Pavel Brazdil
Abstract

This paper presents new measures, based on the induced decision tree, for characterising datasets for meta-learning in order to select appropriate learning algorithms. The main idea is to capture the characteristics of a dataset from the structural shape and size of the decision tree induced from it. In total, 15 measures are proposed to describe the structure of a decision tree. Their effectiveness is illustrated through extensive experiments that compare them with existing data characterisation techniques, including the data characteristics tool (DCT), the most widely used technique in meta-learning, and Landmarking, the most recently developed method.
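
As a rough illustration of characterising a dataset through the shape and size of its induced tree, the sketch below computes a handful of structural measures over a hand-built toy tree. The `Node` class, the `tree_measures` function, and the particular measures chosen (node count, leaf count, height, maximum level width, attribute usage) are assumptions made for illustration only; the abstract does not list the 15 measures proposed in the paper, and in practice the tree would be induced from data rather than written by hand.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    """Hypothetical decision-tree node; a node without children is a leaf."""
    attribute: Optional[str] = None          # attribute tested at an internal node
    children: List["Node"] = field(default_factory=list)


def tree_measures(root: Node) -> Dict[str, int]:
    """Compute a few illustrative structural measures of a decision tree."""
    level_widths: Counter = Counter()   # number of nodes found at each depth
    attr_uses: Counter = Counter()      # how often each attribute is tested
    leaves = 0

    stack = [(root, 0)]                 # iterative depth-first traversal
    while stack:
        node, depth = stack.pop()
        level_widths[depth] += 1
        if node.children:
            attr_uses[node.attribute] += 1
            stack.extend((child, depth + 1) for child in node.children)
        else:
            leaves += 1

    return {
        "nodes": sum(level_widths.values()),
        "leaves": leaves,
        "height": max(level_widths),              # deepest level reached (root = 0)
        "max_width": max(level_widths.values()),  # widest level of the tree
        "distinct_attributes": len(attr_uses),
        "max_attribute_uses": max(attr_uses.values(), default=0),
    }


# Toy tree: the root tests "outlook"; two of its three branches test a second attribute.
toy_tree = Node("outlook", [
    Node("humidity", [Node(), Node()]),
    Node(),
    Node("windy", [Node(), Node()]),
])

measures = tree_measures(toy_tree)
print(measures)
```

A vector of such measures, computed once per dataset, could then serve as the meta-features on which an algorithm-selection model is trained.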

Similar resources

A Novel ICA-based Estimator for Software Cost Estimation

One of the most important and valuable goals of the software development life cycle is software cost estimation (SCE). In recent years, SCE has attracted the attention of researchers due to the huge number of software project requests. Many models have been proposed that use heuristic and meta-heuristic algorithms to perform the machine learning process for SCE. COCOMO81 is one of the most popular...

Ranking of Classifiers based on Dataset Characteristics using Active Meta Learning

Classification is a machine learning technique used to categorize input patterns into different classes. Selecting the best classifier for a given dataset is one of the critical issues in classification. Using a cross-validation approach, it is possible to apply candidate algorithms to a given dataset, and the best classifier is selected by considering various evaluation measure...

A Study of Meta Learning for Regression

In regression applications, there is no single algorithm which performs well with all data since the performance of an algorithm depends on the dataset used. In practice, different algorithms / approaches are tried, and the best one is selected in each application. It is meaningful to ask whether there is a different way instead of running such tedious experiments. In meta learning studies, one...

Dataset Generation for Meta-Learning

Meta-learning tries to improve the learning process by using knowledge about already completed learning tasks. Therefore, features of datasets, so-called meta-features, are used to represent datasets. These meta-features are used to create a model of the learning process. In order to make this model more predictive, sufficient training samples and, thereby, sufficient datasets are required. In t...

Improved teaching–learning-based and JAYA optimization algorithms for solving flexible flow shop scheduling problems

The flexible flow shop (or hybrid flow shop) scheduling problem is an extension of the classical flow shop scheduling problem. In a simple flow shop configuration, a job having ‘g’ operations is performed on ‘g’ operation centres (stages), with each stage having only one machine. If any stage contains more than one machine for providing alternate processing facility, then the problem...

Publication date: 2002